Sharpened Error Bounds for Random Sampling Based $\ell_2$ Regression
Abstract
Given a data matrix X ∈ R^(n×d) and a response vector y ∈ R^n with n > d, it costs O(nd^2) time and O(nd) space to solve the least squares regression (LSR) problem exactly. When n and d are both large, exactly solving the LSR problem is very expensive. When n ≫ d, one feasible approach to accelerating LSR is to randomly embed y and all columns of X into the subspace R^c, where c ≪ n; the induced LSR problem has the same number of columns but far fewer rows, and it can be solved in O(cd^2) time and O(cd) space. Leverage score based sampling is an effective subspace embedding method and can be applied to accelerate LSR. It was previously shown that c = O(d ε^(-1) log d) is sufficient for achieving 1 + ε accuracy. In this paper we sharpen this error bound, showing that c = O(d log d + d ε^(-1)) is enough for 1 + ε accuracy.
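To make the scheme concrete, the following is a minimal Python sketch of leverage score sampling applied to LSR. The exact leverage scores are computed here from a thin SVD purely for illustration (in practice one would approximate them, since computing them exactly costs as much as solving LSR); the function name and interface are our own, not from the paper.

    import numpy as np

    def sampled_lsr(X, y, c, seed=None):
        # Approximately solve min_w ||X w - y||_2 via leverage score sampling.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        # Leverage score of row i = squared norm of the i-th row of U,
        # where X = U S V^T is a thin SVD; the scores sum to rank(X) <= d.
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        lev = np.sum(U ** 2, axis=1)
        p = lev / lev.sum()                           # sampling probabilities
        idx = rng.choice(n, size=c, replace=True, p=p)
        # Rescale each sampled row by 1/sqrt(c * p_i) so that the sketched
        # objective is an unbiased estimate of the full objective.
        scale = 1.0 / np.sqrt(c * p[idx])
        X_s = X[idx] * scale[:, None]
        y_s = y[idx] * scale
        # Solve the induced c x d problem: O(cd^2) time instead of O(nd^2).
        w, *_ = np.linalg.lstsq(X_s, y_s, rcond=None)
        return w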
Similar resources
Least-Squares Regression on Sparse Spaces
Another application is when one uses random projections to project each input vector into a lower dimensional space, and then train a predictor in the new compressed space (compression on the feature space). As is typical of dimensionality reduction techniques, this will reduce the variance of most predictors at the expense of introducing some bias. Random projections on the feature space, alon...
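As a sketch of this compression-on-the-feature-space idea, assuming a dense Gaussian projection (the snippet above is truncated, so the projection type and the downstream predictor are assumptions on our part):

    import numpy as np

    def project_features(X, k, seed=None):
        # Map each input vector in R^d to R^k with a Gaussian random
        # projection; scaling by 1/sqrt(k) roughly preserves inner products
        # in expectation (Johnson-Lindenstrauss style).
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        R = rng.standard_normal((d, k)) / np.sqrt(k)
        return X @ R

    # Any predictor can then be trained on the compressed features, e.g.
    # Z = project_features(X, k); w, *_ = np.linalg.lstsq(Z, y, rcond=None)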
Revisiting the Nystrom method for improved large-scale machine learning
We reconsider randomized algorithms for the low-rank approximation of symmetric positive semi-definite (SPSD) matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our resul...
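As a companion illustration, here is a minimal sketch of the basic Nystrom approximation of an SPSD matrix, assuming uniform column sampling (the paper evaluates a range of sampling and projection variants; the function name is ours):

    import numpy as np

    def nystrom_approx(K, c, seed=None):
        # Nystrom approximation of an n x n SPSD matrix K from c sampled
        # columns: K is approximated by C W^+ C^T, where C holds the sampled
        # columns and W is the c x c block at their intersection.
        rng = np.random.default_rng(seed)
        n = K.shape[0]
        idx = rng.choice(n, size=c, replace=False)    # uniform sampling
        C = K[:, idx]
        W = K[np.ix_(idx, idx)]
        return C @ np.linalg.pinv(W) @ C.T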
Lecture 15: Additive-error Low-rank Matrix Approximation with Sampling and Projections
• A spectral norm bound for reconstruction error for the basic low-rank approximation random sampling algorithm.
• A discussion of how similar bounds can be obtained with a variety of random projection algorithms.
• A discussion of possible ways to improve the basic additive error bounds.
• An iterative algorithm that leads to additive error with much smaller additive scale. This will involve u...
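To illustrate the setting of the first bullet, here is a minimal sketch of additive-error low-rank approximation by column sampling with probabilities proportional to squared column norms; this generic scheme is an assumption on our part, since the lecture notes are only excerpted above.

    import numpy as np

    def sampled_low_rank(A, k, c, seed=None):
        # Sample c columns with probability proportional to squared column
        # norms, rescale them, and project A onto the span of the top-k left
        # singular vectors of the sample; the reconstruction error exceeds
        # that of the best rank-k approximation by an additive term.
        rng = np.random.default_rng(seed)
        p = np.sum(A ** 2, axis=0) / np.sum(A ** 2)
        idx = rng.choice(A.shape[1], size=c, replace=True, p=p)
        C = A[:, idx] / np.sqrt(c * p[idx])
        U, _, _ = np.linalg.svd(C, full_matrices=False)
        U_k = U[:, :k]
        return U_k @ (U_k.T @ A)                      # rank-k approximation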
Online Active Linear Regression via Thresholding
We consider the problem of online active learning to collect data for regression modeling. Specifically, we consider a decision maker with a limited experimentation budget who must efficiently learn an underlying linear population model. Our main contribution is a novel threshold-based algorithm for selection of most informative observations; we characterize its performance and fundamental lowe...
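Since the snippet is truncated, the following Python sketch only illustrates the general shape of threshold-based selection under a query budget; the norm-based threshold rule and all names here are hypothetical, not the paper's actual algorithm.

    import numpy as np

    def active_linear_regression(stream, d, budget, tau):
        # Observe covariate vectors one at a time; spend one unit of budget
        # to query the label only when the point passes the threshold test.
        A = np.zeros((d, d))       # accumulates x x^T over queried points
        b = np.zeros(d)            # accumulates x * y over queried points
        for x, query_label in stream:                 # query_label: () -> y
            if budget == 0:
                break
            if np.linalg.norm(x) >= tau:              # threshold rule
                y = query_label()
                A += np.outer(x, x)
                b += x * y
                budget -= 1
        # Least squares estimate from the queried observations.
        return np.linalg.lstsq(A, b, rcond=None)[0]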
Bayesian Error Based Sequences of Mutual Information Bounds
The inverse relation between mutual information (MI) and Bayesian error is sharpened by deriving finite sequences of upper and lower bounds on MI in terms of the minimum probability of error (MPE) and related Bayesian quantities. The well known Fano upper bound and Feder-Merhav lower bound on equivocation are tightened by including a succession of posterior probabilities starting at the largest...
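For reference, the classical Fano upper bound mentioned above can be stated as follows (this is its standard form, not the paper's tightened sequence):

    H(X \mid Y) \;\le\; h_b(P_e) + P_e \log\bigl(\lvert \mathcal{X} \rvert - 1\bigr),
    \qquad h_b(p) = -p \log p - (1 - p) \log (1 - p),

where P_e is the minimum probability of error and H(X | Y) is the equivocation of X given the observation Y.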
Publication date: 2014